Effect of analysis strategy on reproducibility of detected fMRI activations 
1 Introduction

The primary goal of image analysis in functional
magnetic resonance imaging (fMRI) activation studies
is usually to detect and delineate the image areas
whose signal intensity time course can be related to
the experimental parameters. The task is challenging
because the images are noisy and often corrupted by
motion. The problem is typically solved using a
statistical testing procedure. First, a statistical
parametric map (SPM) is created. Thereafter, either
a non-active or an active state is assigned to each
voxel. We will call this the segmentation phase, as the
goal of this step is to divide the statistical parametric
map into non-active and active regions.

The statistical parametric maps are often computed
using the general linear model, which subsumes, for
example, the simple t test [1,2]. A commonly used
segmentation method is intensity thresholding. The
voxels whose statistical value is larger than a predefined
threshold, directly related to the significance level of
a statistical test, are classified as active.

One way to improve the segmentation is to utilize the
assumption that the probability of activation in a voxel
is related to the existence of activation in its spatial
neighborhood. This a priori assumption can be
incorporated into the classification procedure by
using so-called contextual methods.

Previously, a computationally efficient contextual
clustering algorithm for the segmentation of statistical
parametric maps was introduced [3]. In the contextual
clustering algorithm, both statistical parametric
values and classification information from the
neighborhood of each voxel are used iteratively to
decide whether a voxel is active or not.

The goal of the present study was to analyze the
reproducibility of segmentation. In particular, the effect
of contextual information on reproducibility was
studied. Additionally, we studied whether a
reproducibility index could be used to choose the
segmentation parameters in an objective way.


2 Materials and Methods 

2.1 Subjects and data acquisition 

MR imaging was performed with a Siemens Vision
1.5 T MRI scanner (Siemens AG, Erlangen, Germany)
at the Department of Radiology, Helsinki University
Central Hospital.

Functional MR images were acquired using a
gradient-echo echo-planar imaging (EPI) sequence
(TE 70 ms, TR 2.083 s, flip angle 90°, field of view
256 mm, matrix 64 × 64, 16 slices, slice thickness 3.0
mm, gap 1.0 mm). Four motor experiment studies
were performed on a right-handed healthy volunteer.
To minimize head movement, a head-supporting
vacuum cast was used. The magnetic field was globally
shimmed prior to imaging.

In the experiments, the volunteer was instructed to flex
his right wrist while the character + was shown
on a screen and to rest while the character x was
shown. The paradigm consisted of four rest and four
motor execution blocks, lasting 15 scans each. In
addition, prior to the first rest epoch of each study,
8 scans were acquired to allow the longitudinal
magnetization to reach a steady state.
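
The scan counts of the paradigm can be checked with a short sketch. It builds a 0/1 box-car vector for the analyzed scans and adds the 8 initial scans to obtain the total per study. The strict rest/motor alternation and all names in the code are assumptions made for illustration; the text only states that the first block is a rest block.

```python
# Sketch of the block design. Strict rest/motor alternation is an
# assumption; the text only states that the first block is a rest block.
N_DUMMY = 8          # initial scans acquired before the first rest epoch
BLOCK_LEN = 15       # scans per block
N_BLOCK_PAIRS = 4    # four rest and four motor execution blocks
TR = 2.083           # repetition time in seconds


def boxcar_design():
    """Return a 0/1 list with one entry per analyzed scan (1 = motor)."""
    design = []
    for _ in range(N_BLOCK_PAIRS):
        design.extend([0] * BLOCK_LEN)  # rest block
        design.extend([1] * BLOCK_LEN)  # motor execution block
    return design


design = boxcar_design()
total_scans = N_DUMMY + len(design)
print(len(design), total_scans, round(total_scans * TR, 1))
# prints: 120 128 266.6
```

Under these assumptions each study comprises 128 scans, or roughly 267 s of acquisition.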

2.2 Preprocessing and computation of maps 

Statistical parametric maps were computed using the
SPM99 software (Wellcome Department of Cognitive
Neurology, London, UK) following the guidelines
for a basic statistical analysis. First, the images
were realigned in order to remove movement-related
variance components. Sinc interpolation was used in
the transformation. The statistical parametric maps
(t maps) were computed using the general linear
model. The model for the signal was specified
as a fixed box-car function convolved with a
model hemodynamic response function (hrf). Estimated
movement parameters were used as covariates
in the analysis. Serial correlations were dealt with
by temporal filtering of the data. The cut-off period of
the high-pass filter was set to 125 s. Temporal low-pass
filtering was done to allow for a proper assessment
of degrees of freedom. Finally, the t maps were
converted to z maps.

2.3 Segmentation of z maps 

We follow here the notation in which the activations
have a positive mean. In thresholding, a voxel $i$ is
classified as active if

    $z_i > T$    (1)

where $z_i$ is the z value of voxel $i$ and $T$ is a
predefined intensity threshold. Otherwise, the voxel is
treated as non-active.

The contextual clustering rule used in this paper is as
follows [3]. First, the classification is initialized
using Eq. (1). After the initialization, the voxels are
re-classified. In the re-classification, a voxel is
considered active if

    $z_i + \beta (u_i - N_n/2) > T$    (2)

and otherwise non-active. The constant $N_n$ is the
number of neighbors under consideration. Defining the
neighborhood to consist of the 26 closest voxels gives
$N_n = 26$. The variable $u_i$ is the number of currently
active voxels in the neighborhood of voxel $i$. The
parameter $\beta$ determines the weighting of neighborhood
information and, when positive, encourages neighbors to be
of like class. One way to set $\beta$ is to write

    $\beta = T / s$    (3)

The parameter $s$ is user specified and may
have any positive real value. Intuitively, $s$ can be
understood as the excess of active voxels
($= u_i - N_n/2$) required in the neighborhood of voxel
$i$ to raise a z value of zero to the level of $T$.
Classification rule (2) is repeated until convergence or
oscillation between states occurs. During each iteration
the classification and the values of $u_i$ are updated. As
$s$ approaches $\infty$, contextual clustering approaches
the voxel-by-voxel thresholding technique. As $s$
approaches zero, contextual clustering approaches a
recursive majority vote classification.
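
The rule can be sketched in a few lines of Python. This is a minimal illustration and not the implementation of [3]. It assumes $\beta = T/s$, the choice implied by the description of $s$ above (an excess of $s$ active neighbors lifts a zero z value to $T$). It also assumes that all voxels are updated synchronously from the previous labeling, that neighbors outside the volume count as non-active, and that oscillation means alternation between two states. All function names are hypothetical.

```python
from itertools import product

# The 26 offsets of a 3 x 3 x 3 neighborhood, center voxel excluded.
OFFSETS = [d for d in product((-1, 0, 1), repeat=3) if d != (0, 0, 0)]
N_N = len(OFFSETS)  # 26


def threshold(z, T):
    """Eq. (1): label a voxel 1 (active) if its z value exceeds T."""
    return [[[1 if v > T else 0 for v in row] for row in plane] for plane in z]


def active_neighbors(labels, x, y, k):
    """u_i: number of active voxels among the 26 neighbors of (x, y, k).

    Neighbors outside the volume count as non-active (an assumption).
    """
    nx, ny, nk = len(labels), len(labels[0]), len(labels[0][0])
    count = 0
    for dx, dy, dk in OFFSETS:
        a, b, c = x + dx, y + dy, k + dk
        if 0 <= a < nx and 0 <= b < ny and 0 <= c < nk:
            count += labels[a][b][c]
    return count


def reclassify(z, labels, T, beta):
    """One synchronous pass of Eq. (2) over all voxels."""
    return [[[1 if v + beta * (active_neighbors(labels, x, y, k) - N_N / 2) > T
              else 0
              for k, v in enumerate(row)]
             for y, row in enumerate(plane)]
            for x, plane in enumerate(z)]


def contextual_clustering(z, T, s, max_iter=100):
    """Iterate Eq. (2) with beta = T / s until the labels stop changing
    or start alternating between two states."""
    beta = T / s
    labels = threshold(z, T)
    previous = None
    for _ in range(max_iter):
        updated = reclassify(z, labels, T, beta)
        if updated == labels or updated == previous:
            return updated
        previous, labels = labels, updated
    return labels


# Toy z map: a single supra-threshold voxel in an otherwise empty volume.
z_map = [[[0.0] * 5 for _ in range(5)] for _ in range(5)]
z_map[2][2][2] = 4.0
isolated = contextual_clustering(z_map, T=3.0, s=2.0)
```

In this toy example the isolated voxel passes the threshold of Eq. (1) but is rejected by the contextual rule with $s = 2$, because none of its neighbors is active. With a very large $s$ the result coincides with plain thresholding, as the limiting argument predicts.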

The data sets were contextually segmented with
parameter values $T = 0.1, 0.2, \ldots, 7.0$ and
$s = 0.5, 2, 6, 10, 20, 50$. In addition, the segmentation
was performed using voxel-by-voxel thresholding.
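
For orientation, the parameter grid can be enumerated as follows. The variable names are illustrative only, and the threshold values are generated from integers to avoid accumulating floating-point error.

```python
# Parameter grid for contextual clustering (names are illustrative).
T_VALUES = [round(0.1 * k, 1) for k in range(1, 71)]   # 0.1, 0.2, ..., 7.0
S_VALUES = [0.5, 2, 6, 10, 20, 50]

grid = [(T, s) for s in S_VALUES for T in T_VALUES]
print(len(T_VALUES), len(grid))
# prints: 70 420
```

This gives 420 contextual segmentations per data set, in addition to the 70 plain thresholdings if the same $T$ values are used for thresholding (an assumption).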


2.4 Reproducibility 

Reproducibility of the segmentation was studied by
comparing the analyzed images of the four motor
activation studies. The segmented (i.e. binary) maps of all
four studies were summed voxel by voxel. The voxel
value of the sum image, $R_i$, represents the number
of studies (from zero to four) in which voxel $i$ is
classified as active. This kind of sum image has
been called a reliability map [4], and we adopt
the term here. The reliability map gives an idea of
the reproducibility of the method. If all the non-zero
voxels have the maximum value (in this case four),
the results are perfectly reproducible. If
all the non-zero voxels have a value of one, the results
are not reproducible at all.

In order to compare quantitatively the effect of the
segmentation parameters on reproducibility, we defined
a reproducibility index, $R_m$, as the mean
of all non-zero values of $R_i$.
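
A minimal sketch of the reliability map and the reproducibility index is given below. The maps are flattened to one-dimensional lists for brevity, the toy data are invented for illustration, and returning `None` when no voxel is active is an assumption, since the text does not define the index for that case.

```python
def reliability_map(binary_maps):
    """Sum equally sized flat 0/1 maps voxel by voxel to obtain R_i."""
    return [sum(votes) for votes in zip(*binary_maps)]


def reproducibility_index(reliability):
    """R_m: mean of the non-zero R_i values.

    Returns None when no voxel is active in any study (an assumption).
    """
    nonzero = [r for r in reliability if r > 0]
    if not nonzero:
        return None
    return sum(nonzero) / len(nonzero)


# Four toy segmentations of a five-voxel "image" (invented data).
maps = [
    [1, 1, 0, 0, 1],
    [1, 1, 0, 0, 0],
    [1, 0, 0, 1, 0],
    [1, 1, 0, 0, 0],
]
r = reliability_map(maps)         # [4, 3, 0, 1, 1]
r_m = reproducibility_index(r)    # (4 + 3 + 1 + 1) / 4 = 2.25
```

With four studies the index ranges from 1 (no voxel is detected more than once) to 4 (every detected voxel is detected in all studies).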

The reliability maps were transformed to the coordinates
of the anatomical MR image sets for visualization
purposes using the slice-positioning information
available in the image file headers.

3 Results 

Figure 1 shows the reproducibility index achieved
with different segmentation parameters. As $s$ increases,
the results of contextual clustering approach those of
voxel-by-voxel thresholding. This is the expected
result because contextual clustering approaches
thresholding when $s \to \infty$. Most of the
curves have two local maxima. The first maximum
is obtained with low $T$ values. It is explained
by the fact that with small $T$ values the probability
of false activation is so high that a voxel falsely
classified as active in one experiment is likely to be
classified as active in the other experiments as well.
For example, if thresholding were done using a very
small $T$ ($T \to -\infty$), all voxels would
be classified as active and the reproducibility index
would be $R_m = 4$. Hence, the first maximum is not
of interest. At the second maximum, the probability
of false activations is small and the detected activations
are likely caused by true activations. This latter
maximum is the interesting one.

The highest reproducibility index, $R_m = 3.1$, was
achieved with the parameter combination $s = 2$,
$T = 1.4$. As $s$ was increased, the maximum of the
reproducibility index over the $T$ values decreased, being
$R_m = 2.4$ for thresholding ($T = 5.1$).

Figure 2 shows reliability maps superimposed on
the anatomical MR images. The value $R_i$ in the
reliability maps is the number of studies (from 0 to 4) in
which the voxel was classified as active.

It should be noted that the images shown in Fig. 2 do
not have the same false-positive rates, or significance
levels. The $T$ values were chosen so that $R_m$ was
maximized for the given $s$. A simple simulation
experiment was performed to assess the statistical
significance of the results. 50000 simulated statistical
parametric maps (64 × 64 × 16 voxels in size) without
activations were created and segmented. With
thresholding ($T = 5.1$), at least one (falsely) active
voxel was detected in 542 images (545 active voxels in
total). With contextual clustering using $s = 20$,
$T = 3.1$, an activation was detected in 499 images
(502 active voxels in total). With $s = 2$, $T = 1.4$,
no activations were detected. This result indicates that
the activations of Fig. 2(b) were obtained using a much
stricter criterion than those of Fig. 2(a) or Fig. 2(c).
However, the results of the simulations should be seen
only as rough approximations, as the simulated data were
sampled from a pseudo-random standard normal
distribution and the voxels were spatially uncorrelated.

Activations of the primary sensorimotor cortex were
detected in all cases of Fig. 2. A larger number of
$R_i = 4$ voxels (voxels active in all four experiments)
was detected when contextual information was utilized.
Activations of the supplementary motor area (SMA)
were not detected with $s = 2$ (Fig. 2(b)). The apparent
explanation is the strict statistical criterion.
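
For the thresholding case, the simulation counts reported above can be cross-checked analytically, because the simulated voxels are independent standard normal values. The sketch below computes the expected number of false-positive voxels and of maps containing at least one such voxel. The function and variable names are illustrative. No comparable closed form is attempted for contextual clustering.

```python
import math


def upper_tail(T):
    """P(Z > T) for a standard normal variable Z."""
    return 0.5 * math.erfc(T / math.sqrt(2.0))


N_MAPS = 50000
VOXELS_PER_MAP = 64 * 64 * 16
T = 5.1

p_voxel = upper_tail(T)

# Expected number of supra-threshold voxels over all simulated maps.
expected_voxels = N_MAPS * VOXELS_PER_MAP * p_voxel

# Probability that a map has at least one supra-threshold voxel,
# 1 - (1 - p)^n, evaluated in a numerically stable way.
p_map = -math.expm1(VOXELS_PER_MAP * math.log1p(-p_voxel))
expected_maps = N_MAPS * p_map

print(round(expected_voxels), round(expected_maps))
```

Both expected counts come out in the mid-500s, which agrees with the 545 voxels and 542 images reported for thresholding once sampling variation is taken into account.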

4 Discussion

We have studied the reproducibility of fMRI
activation patterns. In particular, the effect of
contextual information was studied. If functional MRI is
used as a physiological constraint for the electromagnetic
inverse problem, it is of great importance that the
activation areas are delineated reliably [5]. Our view is
that higher reproducibility of the fMRI localization
results increases the possibility of using fMRI as
a constraint in MEG.

There are multiple sources of variation that limit the
possibility of obtaining reproducible results. The
thermal noise follows the normal distribution well and is
independent in neighboring voxels. Therefore
the effects of this noise source can be handled
statistically. Movements are a significant source of
variation. In this study, we used a head-supporting
vacuum cast and a realignment algorithm in order to reduce
the effect of this source of variation. Magnetic field
instabilities [6] and physiological effects may cause
drifts and temporal or spatial correlations in the data.
Therefore we filtered the data in the temporal domain and
used reduced degrees of freedom. Some variation
may arise from timing errors between the stimulus
signal, the imaging sequence and the actual movement
of the hand. The variation caused by these
timing errors is believed to be negligible.

The contextual clustering method uses as input
only the z map (e.g. 64 × 64 × 16 voxels) instead of the
whole data. Hence, the time required to run the contextual
clustering algorithm is only a few seconds on a 500
MHz Pentium III.

In this short paper we have not studied the effect of
the segmentation parameters on the false-positive rates,
sensitivity or segmentation accuracy. These issues
were studied in [3]. In statistical testing techniques,
the parameter values (e.g. thresholds) are chosen
so that a desired false-positive rate is achieved.
There are a few problems associated with the testing
approach. The first is that the selection of the
accepted false-positive rate is somewhat arbitrary. The
second is that the accurate determination of
decision parameter values is difficult due to spatial
autocorrelations and other non-idealities in the data.
Reasonable results were obtained when the parameters
were chosen so that the defined reproducibility
index was maximized. This suggests that it might be
possible to use the reproducibility index as an
objective criterion for choosing the parameter values of a
segmentation algorithm. The obvious drawback of
the approach is that in order to compute a
reproducibility index, an experiment must be repeated or
one experiment divided into parts. In addition,
reasonable results are expected only if there are truly
active regions in the data that can be detected. Hence,
the reproducibility index used as a method to choose
the decision parameter can be seen as a complementary
activation visualization method to be applied after the
existence of activations has been statistically established.

The main result of this study was that the use of
spatial neighborhood information increases
reproducibility. According to the results, the highest
reproducibility indices were achieved when a high weight
was given to contextual information ($s = 2$). However,
in order to preserve small activation regions and
to obtain better segmentation accuracy [3], it may be
useful to use somewhat smaller weights.

Figure 1: Reproducibility index, $R_m$, obtained with
varied parameter values $s$ and $T$. $R_m$ is defined as the
mean of the non-zero $R_i$ values. The $R_i$ value of a voxel
is the number of studies (from 0 to 4) in which the
voxel is classified as active (see Fig. 2).

Figure 2: Reliability maps. The colored voxels represent
the number of studies ($R_i$) in which the voxel
was classified as active. (a) Voxel-by-voxel thresholding,
$T = 5.1$. (b) Contextual clustering, $s = 2$, $T = 1.4$.
(c) Contextual clustering, $s = 20$, $T = 3.1$. With
contextual information the proportion of $R_i = 4$ voxels
increased.




















